Unlike fixed designs, programmable circuit designs support an infinite number of operators. The functionality of a programmable circuit can be altered by simply changing the angle values of the rotation gates in the circuit. Here, we present a new quantum circuit design technique resulting in two general programmable circuit schemes. The circuit schemes can be used to simulate any given operator by setting the angle values in the circuit. This provides a fixed circuit design whose angles are determined efficiently from the elements of the given matrix, which can be non-unitary. We also give both the classical and quantum complexity analysis for these circuits and show that the circuits require only a few classical computations and have almost the same quantum complexities as non-general circuits. Since the presented circuit designs are independent of the matrix decomposition techniques and the global optimization processes used to find quantum circuits for a given operator, high-accuracy simulations can be done for the unitary propagators of molecular Hamiltonians on quantum computers. As an example, we show how to build the circuit design for the hydrogen molecule.






I. INTRODUCTION 

Classical logic devices can be broadly categorized as fixed and programmable devices. As their names suggest, the circuits in a fixed logic device support only one function, which is determined at the time of manufacture and cannot be changed later. On the other hand, programmable devices such as PLDs and FPGAs are able to support an infinite number of functionalities since they can be reconfigured outside of the manufacturing environment. With this feature, designers and programmers can run and simulate their test designs and algorithms.

Quantum computing has become a large interdisciplinary area by providing different approaches and protocols to various subfields including communication, encryption, global binary optimization (see adiabatic quantum computing [2]), linear algebra, and so on; however, programmable quantum circuits and chip designs like those in classical computers have remained an open issue.

In the circuit model of quantum computing, unitary matrix operators represent the algorithms or some part of the computations [6]. Hence, one of the fundamental issues is to have a general purpose quantum circuit or a quantum chip that can realize different types of algorithms in a fast and efficient way. The possibility of designing universal quantum gate arrays as a general purpose quantum computer has been discussed previously. It has been shown that a gate array can be programmed to evaluate the expectation value of a given operator. For the realization of a quantum gate, a cell-structured quantum circuit design based on the activation and deactivation of the gates on different qubits has been proposed: it is shown that a combination of such cells can be used to realize a given quantum gate sequence [9]. Moreover, different schemes of general programmable universal quantum circuits have been shown for two [10, 11] and three qubits, found by applying different decomposition schemes to a given unitary operator. Based on the general two-qubit circuit design, a two-qubit quantum processor has been experimentally realized [15]. However, the realization of a general quantum processor and a full-scale quantum computer is still an obstacle which requires new theoretical and experimental improvements.

It is known that the realization of quantum logical operations can be simplified by using higher dimensional Hilbert spaces [16, 17]. In this paper, using ancilla qubits, we describe a new circuit design approach which produces two programmable quantum circuit designs. These can be further improved to design general large-scale quantum chips and programmable quantum gate arrays. The circuits also support the simulation of non-unitary matrices. We also give the complexity analysis for the circuits. In terms of quantum complexity, they have about the same complexity as non-programmable designs generated by using matrix decompositions from numerical linear algebra such as the QR decomposition, the quantum Shannon decomposition, the cosine-sine decomposition, and some others [19, 20] (see the cited references for the comparison and the complexities of these methods). In terms of classical complexity, since the angles for our programmable circuits can be determined simply from individual matrix elements, the classical cost is much lower than that of the decomposition methods.

This paper is organized as follows: After giving the general simulation idea, the details of two circuit designs implementing this idea are presented. Then the complexity of the circuits is analyzed in terms of classical and quantum complexities. Finally, we discuss the circuit designs and possible future directions. In the appendix, more computational details related to the matrices are presented.



II. THE GENERAL SIMULATION IDEA 

For a given real unitary $U_{N\times N}$ with $N = 2^n$, where $n$ is the number of qubits, the relationship between the input $|\psi\rangle = \alpha_1|0\ldots0\rangle + \cdots + \alpha_N|1\ldots1\rangle$ and the output $|\varphi\rangle$ is defined as $|\varphi\rangle = U|\psi\rangle$, generating $N$ states:

$$|\varphi\rangle = U|\psi\rangle = \Big[\textstyle\sum_{j} u_{1j}\alpha_j,\ \ldots,\ \sum_{j} u_{Nj}\alpha_j\Big]^T. \qquad (1)$$

Any system of higher dimension (in which ancilla qubits are added to the original system) can also be used to generate this output on $N$ chosen states with some normalization. Our goal is to create a matrix $V$ (shown in Eq. (2)) which represents the system with the ancilla. We then modify the initial input $|0\rangle|\psi\rangle$ of this extended system (the initial state of the ancilla is taken as $|0\rangle$) by using quantum operations such that the application of $V$ to the modified input $|\tilde\psi\rangle$ includes the output given in Eq. (1) with a normalization constant $\kappa$:

$$V = \begin{pmatrix} V_1 & & & \\ & V_2 & & \\ & & \ddots & \\ & & & V_X \end{pmatrix}, \qquad (2)$$

where each $V_i$ has some distinct rows of $U$ as its leading rows. Adding a sufficient number of ancilla qubits to control each $V_i$ uniformly (as shown in Fig. 1) permits us to produce the circuit equivalent of the matrix $V$ in the above equation. If we assume that the first row of $V_i$ is (or includes) the $i$th row of $U$, then we need to use $X = N$ such $V_i$ blocks as shown in Eq. (2).



FIG. 1: The number of qubits on the ancilla determines the number of $V_i$s and hence the size of $V$ in Eq. (2).

The quantum operations that construct the matrix $V$ and the operations that modify the input $|0\rangle\otimes|\psi\rangle$ form the circuit that simulates the given operator. That is, the steps that form the rows of $U$ in $V$ and that transform $|0\rangle|\psi\rangle$ to $|\tilde\psi\rangle$ generate the general circuit design for the simulation of $U$. One way to formulate these steps and to build the $V_i$ matrices and the input is as follows. First, the system is extended by adding auxiliary qubits. These ancilla qubits uniformly control different block quantum operations, the $V_i$s, on the main $n$ qubits (in this paper, $n$ or $n+1$ auxiliary qubits are used). After the formation of all elements of $U$, which we call the Formation step, the elements in the same row of $U$ are brought to the first row of each $V_i$, which we call the Combination step. The input is modified ($|0\rangle|\psi\rangle \to |\tilde\psi\rangle$) by a small circuit such that $V|\tilde\psi\rangle$ produces an output which includes the normalized $N$ states expected from the operation $U|\psi\rangle$. We call this step the Input modification step. The measurement results for these $N$ states exactly simulate $U|\psi\rangle$. The circuit design found with these steps can be drawn as a block circuit diagram (as shown in Fig. 2). This approach provides a new way to find circuit designs. In what follows, we describe two different programmable circuit schemes based on the block circuit in Fig. 2.
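The general idea of this section can be checked numerically at the matrix level. The following is a minimal sketch, not the gate-level construction of either design: it assumes the ancilla register is the most significant part of the state index, and it completes each row of $U$ to an orthogonal block with a Householder reflection, which is an illustrative choice made here. The function names `block_with_leading_row` and `simulate_with_ancilla` are hypothetical.

```python
import numpy as np

def block_with_leading_row(row):
    """Return a real orthogonal matrix whose first row is the unit vector `row`.

    A Householder reflection is used purely for illustration; it is not the
    gate sequence produced by the circuit designs in the paper.
    """
    e1 = np.zeros_like(row)
    e1[0] = 1.0
    w = row - e1
    norm_sq = w @ w
    if norm_sq < 1e-30:              # `row` is already the first basis vector
        return np.eye(row.size)
    return np.eye(row.size) - 2.0 * np.outer(w, w) / norm_sq

def simulate_with_ancilla(u, psi):
    """Recover U|psi> from the block-diagonal V applied to the modified input."""
    dim = u.shape[0]
    kappa = 1.0 / np.sqrt(dim)       # normalization of the modified input
    v = np.zeros((dim * dim, dim * dim))
    for i in range(dim):
        # V_i is controlled by ancilla state i and has row i of U as leading row
        v[i * dim:(i + 1) * dim, i * dim:(i + 1) * dim] = block_with_leading_row(u[i])
    psi_mod = kappa * np.tile(psi, dim)   # one scaled copy of psi per ancilla state
    out = v @ psi_mod
    # The leading row of block i sits at index i * dim; undo the normalization.
    return out[::dim] / kappa
```

For a real unitary `u` and a state vector `psi`, `simulate_with_ancilla(u, psi)` should agree with `u @ psi` up to floating-point error, since only the leading row of each block contributes to the chosen $N$ states.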



III. GENERATION OF PROGRAMMABLE CIRCUITS 
A. The First Circuit Design 

In this design, we first create all elements of $U$ at the diagonal positions of $V$ by using one rotation gate for each element of $U$ (the Formation step). In the Combination step, the elements on the $i$th row of $U$ are collected in the first row of the corresponding $V_i$.

Formation Step: In this step, the elements of $U$ are tiled across the diagonal of a new higher-dimensional matrix $V_f$. This is a block diagonal matrix with 2×2 blocks across the diagonal. For each element of $U$, one rotation gate is used. The angle of the gate is chosen so that its cosine equals the corresponding element of $U$. Controlling such gates in a uniform binary coded fashion produces the matrix which has all elements of $U$ on its diagonal:

$$V_f = \begin{pmatrix} R_1 & & & \\ & R_2 & & \\ & & \ddots & \\ & & & R_{N^2} \end{pmatrix}_{2N^2\times 2N^2}, \qquad (3)$$

where the diagonal entries of each $R_j$ are $c_j = \cos(\arccos(u_j))$, generating the $j$th element of $U$. We use $n+1$ ancilla qubits to uniformly control each $R_j$, $1 \le j \le N^2$.

Combination Step: To bring the elements in the same row of $U$ to the first rows of the $V_i$s, we need a quantum operation $V_c$ which will produce the matrix $V = V_c V_f$ represented as:

where K should have a form similar to the following matrix:

For a system with $n+1$ qubits, single Hadamard gates on the first $n$ qubits generate the above matrix with $\kappa = \pm 1/\sqrt{2^n}$. Hence, $V_c$ is the matrix form of this operation in the system with $2n+1$ qubits, where we apply the Hadamard gates to the $(n+1)$st, $n$th, ..., 3rd, and 2nd qubits from the bottom in the circuit.

FIG. 2: Block circuit diagram to simulate $U$ by modifying the input $|0\rangle|\psi\rangle$ to $|\tilde\psi\rangle$ and constructing $V$ in two steps: the formation of the elements of $U$ in $V$, and the combination, which brings the elements in the same row of $U$ to the first rows of the $V_i$s in $V$. The gates necessary to form $V$ and to transform $|0\rangle|\psi\rangle$ to $|\tilde\psi\rangle$ generate the circuit.

Input modification Step: In the final matrix in Eq. (4), the states corresponding to the rows which possess the elements of $U$ with the normalization factor $\kappa$ are to be assigned as the $N$ chosen states simulating $U$. We should therefore modify the input in such a way that the elements represented as "•"s between $\kappa u_{ij}$ and $\kappa u_{i(j+1)}$ are disregarded. That means the initial input should be transformed into $|\tilde\psi\rangle$ by an operation applied prior to the final matrix $V$, so that the elements of the input corresponding to the "•" elements are set to zero:

$$|0\rangle|\psi\rangle \to |\tilde\psi\rangle: \quad [\alpha_1\ \alpha_2 \cdots \alpha_N\ 0 \ldots 0]^T \to [\kappa\alpha_1\ \kappa\alpha_2 \cdots \kappa\alpha_N\ \ldots\ \kappa\alpha_1\ \kappa\alpha_2 \ldots \kappa\alpha_N]^T, \qquad (6)$$

where $\kappa$ is a normalization constant. It is easy to see that this modification can be achieved by simple Hadamard gates on the first $n$ qubits and sequential swap operations between the $(n+1)$st and the remaining $n$ qubits.



The equivalent circuit simulating any $U$ is drawn in Fig. 3 for an $n$-qubit system by adding $n+1$ ancilla qubits and replacing the block circuits in Fig. 2 with the explicit circuits found above.

At the end of this circuit, which can be decomposed into one- and two-qubit gates by using the decomposition technique discussed in Sec. IV, the following set of $N$ states exactly simulates the given unitary $U$ after normalization:

$$|0\ldots000\text{-}0\ldots0\rangle,\ |0\ldots010\text{-}0\ldots0\rangle,\ \ldots,\ |1\ldots110\text{-}0\ldots0\rangle, \qquad (7)$$

where the dashes are used to separate the main and the ancillary qubits.

In the Appendix, we give an example of the explicit matrices used for each step of the algorithm.
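The three steps of the first design can be sketched at the matrix level as follows. This is a minimal illustration under assumptions made here, not the authors' implementation: the state index is ordered as (top $n$ ancilla qubits, $n$ main qubits, bottom ancilla qubit), each rotation block is taken as $[[c, s], [-s, c]]$, and the modified input is written with its zero entries explicit. The function name `first_design` is hypothetical.

```python
import numpy as np

def first_design(u, psi):
    """Matrix-level sketch of the first design: returns (V, readout).

    `readout` holds the amplitudes of the N chosen states, rescaled so that
    they can be compared directly with u @ psi.
    """
    dim = u.shape[0]                       # N = 2**n
    n = int(round(np.log2(dim)))

    # Formation step: one rotation per element of U, cos(angle) = element.
    angles = np.arccos(np.clip(u, -1.0, 1.0)).reshape(-1)
    v_f = np.zeros((2 * dim * dim, 2 * dim * dim))
    for j, t in enumerate(angles):
        c, s = np.cos(t), np.sin(t)
        v_f[2 * j:2 * j + 2, 2 * j:2 * j + 2] = [[c, s], [-s, c]]

    # Combination step: Hadamard gates on the n main qubits.
    h = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    h_n = np.array([[1.0]])
    for _ in range(n):
        h_n = np.kron(h_n, h)
    v_c = np.kron(np.eye(dim), np.kron(h_n, np.eye(2)))
    v = v_c @ v_f

    # Input modification: kappa * [a1, 0, a2, 0, ...] repeated for each
    # state of the top ancilla register.
    kappa = 1.0 / np.sqrt(dim)
    psi_mod = kappa * np.tile(np.kron(psi, [1.0, 0.0]), dim)

    out = v @ psi_mod
    # Chosen states: top ancilla = i, all remaining qubits in |0>.
    # Their amplitudes equal (1/N) * (U psi)_i under these conventions.
    return v, out[::2 * dim] * dim
```

Because the angle of each gate is just the arccosine of one matrix element, the same sketch also accepts a non-unitary real matrix whose entries lie in $[-1, 1]$; the matrix `v` stays orthogonal in either case.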

B. The Second Circuit Design 

In the first circuit design, the elements of $U$ are initially formed on the diagonal of $V$ by using uniformly controlled rotation gates. Here, we take a group of elements from a row of $U$ and create them as the leading row of small block matrices by preserving the ratios between the elements. Using a rotation gate for each pair of these initial small blocks, we create larger block matrices which have more elements of $U$ in their first rows. This combination step is repeated until the final $V_i$s, whose leading rows are the rows of $U$ as in Eq. (2), are constructed. Since the final blocks, the $V_i$s, are $N \times N$, the matrix $V$ is $N^2 \times N^2$; therefore $n$ qubits are needed for the ancilla. The input modification step follows the same idea as described for the first design.

FIG. 3: The first circuit design for a given general matrix: the initial Hadamards and the SWAPs modify the input, and the last Hadamards carry the elements to the first rows of the $V_i$s (combination step). The uniformly controlled quantum gates in the middle form all elements of $U$ on the diagonal of $V$ (formation step).



Formation Step: As stated above, instead of forming matrix elements at the diagonal positions by using a rotation gate for each element of $U$, a group of elements is created in the first row of each block with the same ratio as those elements in the original matrix. For instance, if the initial blocks are of dimension 2 by 2, the first row implements two elements, $u_{ij}$ and $u_{ik}$, of $U$. Thus, the ratio between the first element and the second element of a 2 by 2 block matrix is the same as $u_{ij}/u_{ik}$ (since the block is 2 by 2, the elements of the block matrix are the cosine and sine values of an angle $\theta_x$ which provides the equality $\cos(\theta_x)/\sin(\theta_x) = u_{ij}/u_{ik}$). In our circuit designs, we assume $k = j + 1$, so the first row elements of each block implement the ratios of the elements in the same order as the original matrix. Therefore, if the first blocks are of dimension $d \times d$, the total number of initial blocks will be $N/d$, since each block implements $d$ elements. The following matrix represents the formation step for 2 by 2 initial blocks:

where the $k$s are the normalization constants and the $u$s are the elements of $U$. The $V_i$ block operations in Fig. 4 produce a matrix $V_f$ with 4 by 4 block matrices on its diagonal.

Combination Step: After the formation with ratios, blocks are combined using one rotation gate for each pair of blocks so as to form new larger blocks with new normalization constants that preserve the original ratios of the elements. Each of these new blocks has twice as many elements as the former blocks. As an example, we will combine two 4 by 4 matrices located on the diagonal of the matrix $V_f$:

where $k_1$ and $k_2$ are the normalization factors. The following matrix, $V_{c_8}$, can be used as a combination matrix to generate a larger 8 by 8 block from the above pair of 4 by 4 blocks:

where $c_x$ and $s_x$ are the cosine and sine values of an angle $\theta_x$ chosen to achieve the required ratio. The matrix multiplication $V_{c_8}V_{f_8}$, where $V_{f_8}$ is the above pair of blocks, produces a matrix with the leading row $[k_x u_1 \ldots k_x u_8]$, where $k_x = \sin\theta_x\, k_2 = \cos\theta_x\, k_1$.

It is easy to see that the matrix $V_{c_8}$ can be written as $R(2\theta_x)\otimes I\otimes I$. Hence, any such general combination matrix can be written as $R \otimes I_D$, where $D$ is the size of the blocks to be combined by using $V_c$, and $R$ is a general one-qubit rotation gate. This means that for blocks operating on $c$ qubits, applying a rotation gate to the $(c+1)$st qubit is equivalent in matrix form to the operation $V_{c_{2^{c+1}}}$. Hence, putting single rotation gates on the $(c+1)$st, $(c+2)$nd, ..., $n$th qubits generates an $N$ by $N$ matrix. Furthermore, by controlling each $V_c$ operation (or equivalently each single rotation gate $R$) uniformly by the upper qubits in the circuit (see the uniformly controlled rotation gates located after the $V_i$ block operations in Fig. 4), we can generate $N$ such separate blocks and the following final matrix:

Since the resulting rows in each block are unit vectors and have the same ratios as the row elements of $U$, they are equal to the corresponding rows of $U$. (The final normalization constants become equal to 1.)

For the general case, if the initial blocks operate on the last $c$ qubits, we need to use $N/2^c$ uniformly controlled rotation gates on each main qubit (excluding the last $c$ qubits) in order to recursively combine small blocks. At the end, we have $N \times N$ blocks whose leading rows are the rows of $U$ as shown in Eq. (11).

Input Modification ($|0\rangle|\psi\rangle \to |\tilde\psi\rangle$): Modification of the input $[\alpha_1\ \alpha_2 \ldots \alpha_N\ 0 \ldots 0]^T$ as $[\kappa\alpha_1 \ldots \kappa\alpha_N \,|\, \kappa\alpha_1 \ldots \kappa\alpha_N \,|\, \ldots \,|\, \kappa\alpha_1 \ldots \kappa\alpha_N]^T$ with the normalization constant $\kappa$ allows us to simulate $U$ by using $V$ in Eq. (11) on the chosen $N$ states:

$$|0\ldots000\text{-}0\ldots0\rangle,\ \ldots,\ |1\ldots111\text{-}0\ldots0\rangle. \qquad (12)$$

This input with $\kappa = 1/\sqrt{2^n}$ can be produced by applying Hadamard gates to all ancilla qubits at the beginning of the circuit.

Consequently, the general circuit design shown in Fig. 4 is obtained, which is able to simulate any real unitary matrix. For more explicit matrix forms and illustrative details, please refer to Appendix A 1 and Appendix A 2.
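The recursive combination used in the second design can be sketched numerically as follows. This is an illustrative sketch under assumptions made here: 2 by 2 initial blocks, the rotation convention $[[c, s], [-s, c]]$, and a block whose elements are all zero replaced by the identity. The names `block_for_row` and `second_design` are hypothetical.

```python
import numpy as np

def rotation(c, s):
    """2x2 rotation block with leading row (c, s)."""
    return np.array([[c, s], [-s, c]])

def block_for_row(row):
    """Build an orthogonal block whose leading row is row / ||row||."""
    # Formation step: 2x2 blocks that preserve the ratio of each element pair.
    blocks, norms = [], []
    for a, b in np.asarray(row, dtype=float).reshape(-1, 2):
        r = np.hypot(a, b)
        c, s = (a / r, b / r) if r > 0 else (1.0, 0.0)
        blocks.append(rotation(c, s))
        norms.append(r)

    # Combination step: one rotation per pair of blocks, repeated until a
    # single block remains; each pass doubles the block size.
    while len(blocks) > 1:
        new_blocks, new_norms = [], []
        for k in range(0, len(blocks), 2):
            d = blocks[k].shape[0]
            r = np.hypot(norms[k], norms[k + 1])
            c, s = (norms[k] / r, norms[k + 1] / r) if r > 0 else (1.0, 0.0)
            pair = np.zeros((2 * d, 2 * d))
            pair[:d, :d] = blocks[k]
            pair[d:, d:] = blocks[k + 1]
            # Combination matrix of the form R (x) I_D acting on the pair.
            new_blocks.append(np.kron(rotation(c, s), np.eye(d)) @ pair)
            new_norms.append(r)
        blocks, norms = new_blocks, new_norms
    return blocks[0]

def second_design(u, psi):
    """Apply V = diag(V_1, ..., V_N) to the modified input and read out U|psi>."""
    dim = u.shape[0]
    kappa = 1.0 / np.sqrt(dim)
    v = np.zeros((dim * dim, dim * dim))
    for i in range(dim):
        v[i * dim:(i + 1) * dim, i * dim:(i + 1) * dim] = block_for_row(u[i])
    out = v @ (kappa * np.tile(psi, dim))   # Hadamards on the ancilla copy psi
    return out[::dim] / kappa
```

Since every row of a unitary matrix is a unit vector, the final normalization constant is 1 and the leading row of each block equals the corresponding row of `u`, which is the property the combination step relies on.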



IV. COMPLEXITY ANALYSIS OF THE CIRCUITS 

Both the classical and the quantum complexities of the circuits explained above depend mostly on the costs of uniformly controlled networks such as the one in Fig. 5a. Such a network controlled by $k$ qubits can be decomposed in terms of $2^k$ CNOT gates and $2^k$ single-qubit rotation gates. For instance, the circuit illustrated for $k = 2$ in Fig. 5a can be decomposed as in Fig. 5b. The angle values in the decomposed circuit are found as the solution of the system of linear equations $M^k\theta = \phi$:

$$M^k \begin{pmatrix}\theta_1\\ \vdots\\ \theta_{2^k}\end{pmatrix} = \begin{pmatrix}\phi_1\\ \vdots\\ \phi_{2^k}\end{pmatrix}, \qquad (13)$$

where $k$ is the number of control qubits in the network, and the entries of $M^k$ are defined as:

$$M_{ij} = (-1)^{b_{i-1}\cdot g_{j-1}}, \qquad (14)$$

in which the power term is found by taking the dot product of the standard binary code of the index $i-1$, $b_{i-1}$, and the binary representation of the $(j-1)$th Gray-coded integer, $g_{j-1}$. Since $M^k$ is a column-permuted version of the Hadamard matrix, $2^{-k/2}M^k$ is unitary. Thus, $(M^k)^{-1} = 2^{-k}(M^k)^T$, and the new angle values in the decomposed circuit are the result of a mere matrix-vector multiplication:

$$\theta = 2^{-k}(M^k)^T\phi. \qquad (15)$$

FIG. 4: The second circuit with 4 by 4 initial blocks: the differently controlled quantum gates in the networks, after the $V_i$ blocks, combine small blocks and build the $N$ by $N$ blocks at the end. The initial Hadamards are for the modification of the input. The $V_i$ blocks are for the formation step.

FIG. 5: (a) A Gray-coded multi-control network. (b) The decomposition of the Gray-coded network in (a) into CNOT and single-qubit gates.
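The angle computation for a uniformly controlled network can be sketched directly from the definition of $M^k$. The following is a minimal illustration; the function names are hypothetical, indices are zero-based (so `i` and `j` here play the roles of $i-1$ and $j-1$), and the binary-reflected Gray code `j ^ (j >> 1)` is assumed.

```python
import numpy as np

def gray(j):
    """Binary-reflected Gray code of the integer j."""
    return j ^ (j >> 1)

def m_matrix(k):
    """M^k with entries (-1)^(b_i . g_j), using zero-based i and j."""
    size = 2 ** k
    m = np.empty((size, size))
    for i in range(size):
        for j in range(size):
            # Dot product of two bit strings = number of shared set bits.
            m[i, j] = (-1) ** bin(i & gray(j)).count("1")
    return m

def decomposed_angles(phi):
    """Solve M^k theta = phi using (M^k)^{-1} = 2^{-k} (M^k)^T."""
    phi = np.asarray(phi, dtype=float)
    k = int(round(np.log2(phi.size)))
    return m_matrix(k).T @ phi / 2 ** k
```

This naive version costs one full matrix-vector product per network, which is the cost that the fast Hadamard transform reduces.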



A. The complexity of the first circuit design

1. The Classical Complexity

In the first circuit diagram (see Fig. 3), since there is only one such network, we need to multiply the $2^{2n}\times 2^{2n}$ matrix by the vector of dimension $2^{2n}$ constructed by taking the arc-cosines of every element of $U$. Hence, the classical complexity for the first circuit is $O(2^{4n})$. However, since $M$ is a permuted version of the Hadamard matrix, by using the fast Hadamard transform [21], which requires $O(N\log N)$ computations for the transform of a vector by the Hadamard matrix, this can be achieved in:

$$O(2^{2n}\log(2^{2n})) = O(2n\,2^{2n}). \qquad (16)$$

2. The Quantum Complexity

The quantum complexity of the circuit is the number of gates required for the decomposition of the network, the combination of the blocks, and the input modification: $2^{2n}$ CNOT, $2^{2n}$ single rotation, $2n$ Hadamard, and $n$ SWAP gates.
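The speed-up from the fast Hadamard transform and the gate counts of the first design can be sketched as follows. This is an illustrative sketch: the function names are hypothetical, the transform is the unnormalized Walsh-Hadamard transform in natural (Sylvester) ordering, and the Gray-code column permutation is applied afterwards by indexing.

```python
import numpy as np

def fwht(vec):
    """Unnormalized fast Walsh-Hadamard transform, O(N log N) operations."""
    a = np.array(vec, dtype=float)
    h = 1
    while h < a.size:
        for start in range(0, a.size, 2 * h):
            x = a[start:start + h].copy()
            y = a[start + h:start + 2 * h].copy()
            a[start:start + h] = x + y          # butterfly: sums
            a[start + h:start + 2 * h] = x - y  # butterfly: differences
        h *= 2
    return a

def angles_via_fwht(phi):
    """theta = 2^{-k} (M^k)^T phi without forming M^k explicitly."""
    phi = np.asarray(phi, dtype=float)
    size = phi.size
    w = fwht(phi)                               # w[j] = sum_i (-1)^(i . j) phi[i]
    gray = [j ^ (j >> 1) for j in range(size)]  # column permutation of M^k
    return w[gray] / size

def first_design_gate_counts(n):
    """Gate counts quoted for the first design with n main qubits."""
    return {
        "cnot": 2 ** (2 * n),
        "rotation": 2 ** (2 * n),
        "hadamard": 2 * n,
        "swap": n,
    }
```

For the first design the input vector has $2^{2n}$ entries (the arc-cosines of the elements of $U$), so `angles_via_fwht` performs on the order of $2n\,2^{2n}$ additions instead of the $2^{4n}$ multiplications of the naive product.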



B. The complexity of the second circuit 

The classical and the quantum complexities for the second circuit are determined by the number of networks which are formed by putting the uniformly controlled quantum gates in the blocks together, as shown in Fig. 5, and by the combination steps. Since quantum gates that have the same angles in different blocks act identically for every state of the control qubits, putting them together does not produce networks. Instead, they need to be applied only once, such as the controlled X gates shown in Fig. 6c. Hence, if the initial blocks of size $2^c$ by $2^c$ (operating on $c$ qubits) include $m$ different quantum gates (the types of the gates are the same, but each requires different angles in different blocks, such as $R_1$ and $R_2$ in Fig. 6), these blocks together produce $m$ Gray-coded networks controlled by $2n - c$ qubits.

In addition, in the combination step, we use binary coded networks on each main qubit excluding the last $c$ qubits to produce $N$ by $N$ blocks. Thus, we will also have $n - c$ Gray-coded networks for the combination step, for which the number of control qubits goes down by one from one combination step to another (or from one Gray-coded network to another). The classical and the quantum complexity will be determined mostly by the decompositions of these $m + (n - c)$ networks.

FIG. 6: The circuit in (a) with 4 by 4 initial blocks can be represented as in (b) by using the circuit given in Fig. 7. Without changing the order of the gates having the same control state, the gates can be moved to form uniformly controlled networks as in (c). If a gate has the same angle value for all control states, such as the controlled X gates in the circuit, the gates are equal to a single gate (in the case of the X gates in the circuit, only one CNOT is required).



1. Classical Complexity 

As mentioned above, in the formation step, the decomposed block circuits together form $m$ Gray-coded networks for $m$ different gates, as represented for two-qubit blocks in Fig. 6. Hence, to find the decompositions of these networks as in Fig. 5b by the formula given in Eq. (15), $m$ matrix-vector multiplications are needed. The dimensions of the matrices are $2^{2n-c}\times 2^{2n-c}$ and the dimensions of the vectors are $2^{2n-c}$. Using the fast Hadamard transform, the complexity for this part is found to be $O_f = O(m(2n-c)2^{2n-c})$ instead of the $O(m(2^{2n-c})^2)$ of the naive matrix-vector multiplication.

Furthermore, the cost of the combination step is the sum of the computations done for finding the angles of the $n - c$ Gray-coded networks (remember that the number of control qubits decreases by one from one network to another). This is equal to $O((2^{2n-c-1})^2) + O((2^{2n-c-2})^2) + \cdots + O((2^{2n-c-n+c})^2) = O(2^{4n-2c} - 2^{2n})$ by the naive matrix-vector multiplication. By the fast Hadamard transform, the complexity of the combination step is as follows:

$$\begin{aligned} O_c &= O((2n-c-1)2^{2n-c-1}) + O((2n-c-2)2^{2n-c-2}) + \cdots + O(n2^n) \\ &= O(2(1 - (2n-c)2^{2n-c-1} + (2n-c-1)2^{2n-c})) - O(2(1 - n2^{n-1} + (n-1)2^n)) \\ &= O((2n-c-2)2^{2n-c} - (n-2)2^n). \end{aligned} \qquad (17)$$

Thus, while the total complexity by the naive multiplication is

$$O(2^{4n-2c} - 2^{2n}) + O(m(2^{2n-c})^2) = O((m+1)2^{4n-2c} - 2^{2n}), \qquad (18)$$

by the fast Hadamard transform, it is:

$$O_f + O_c = O((m+1)(2n-c)2^{2n-c} - 2^{2n-c+1} - (n-2)2^n). \qquad (19)$$



2. The Quantum Complexity 

In terms of the quantum complexity, the analysis follows the same structure: as mentioned, $m$ different gates in the blocks on $c$ qubits create $m$ networks controlled by $2n - c$ qubits. The decomposition of these networks requires $m2^{2n-c}$ CNOT gates and the same number of single-qubit gates.

Since $n - c$ combinations ($n - c$ networks) are necessary, the complexity of the combination step is the sum of $n - c$ terms: $2^{2n-c-1} + 2^{2n-c-2} + \cdots + 2^{2n-c-n+c} = 2^{2n-c} - 2^n$. Then the total CNOT complexity reads as:

$$m2^{2n-c} + 2^{2n-c} - 2^n + \xi = (m+1)2^{2n-c} - 2^n + \xi, \qquad (20)$$

where $\xi$ represents the common gates in each block that need to be run only once.

Example: As an example, the complexity of a general 4 by 4 block circuit can be found as follows. By using the Schmidt decomposition, any 1 by 4 unit vector $u_x$ can be decomposed as $u_x = \sum_i a_i\, v_i^1 \otimes v_i^2$. Since $V_1$ and $V_2$, composed of the $v_i^1$ and $v_i^2$ vectors, are 2 by 2 unitary matrices, these matrices (with the elements $\cos_1$ and $\sin_1$ for $V_1$, and $\cos_2$ and $\sin_2$ for $V_2$) and the coefficients satisfying $|a_1|^2 + |a_2|^2 = 1$ can be considered as rotation gates. For the coefficients, $a_1$ and $a_2$ are the cosine and the sine values of a rotation gate ($a_1 = \cos\alpha$ and $a_2 = \sin\alpha$). The resulting decomposition becomes equal to the following:

$$\begin{pmatrix} a_1\cos_1\cos_2 + a_2\sin_1\sin_2 \\ -a_1\cos_1\sin_2 + a_2\sin_1\cos_2 \\ -a_1\sin_1\cos_2 + a_2\cos_1\sin_2 \\ a_1\sin_1\sin_2 + a_2\cos_1\cos_2 \end{pmatrix}, \qquad (21)$$

which requires three rotation gates in general. The circuit given in Fig. 7 forms any $u_x$ as the leading row of its 4 by 4 matrix.

FIG. 7: Quantum circuit which is found by following the Schmidt decomposition and can generate any vector of dimension 4 as the first row of its matrix representation.

Therefore, taking this circuit to implement the blocks in Fig. 4 gives $c = \xi = 2$ and $m = 3$; hence the CNOT complexity of the whole circuit in Fig. 4 reads as $2^{2n} - 2^n + 2$. Also note that if the blocks in the circuit shown in Fig. 4 were of dimension 2 by 2, then the complexity would be $2^{2n} - 2^n$.
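The claim that a real 4-dimensional unit vector is fixed by three rotation angles can be checked with a singular value decomposition, which gives the Schmidt decomposition of the vector reshaped as a 2 by 2 matrix. The following is a sketch under assumptions made here: the rotation convention $[[\cos, -\sin], [\sin, \cos]]$ is chosen for convenience and may differ in sign placement from the one used in Eq. (21), and the function names are hypothetical.

```python
import numpy as np

def rot(t):
    """2x2 rotation matrix for angle t."""
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

def schmidt_angles(u):
    """Return (alpha, theta1, theta2) describing the real unit 4-vector u."""
    mat = np.asarray(u, dtype=float).reshape(2, 2)
    left, coeffs, right_t = np.linalg.svd(mat)   # mat = left @ diag(coeffs) @ right_t
    # Turn both orthogonal factors into proper rotations; the sign changes
    # are absorbed into the second Schmidt coefficient.
    if np.linalg.det(left) < 0:
        left[:, 1] *= -1
        coeffs[1] *= -1
    if np.linalg.det(right_t) < 0:
        right_t[1, :] *= -1
        coeffs[1] *= -1
    alpha = np.arctan2(coeffs[1], coeffs[0])     # a1 = cos(alpha), a2 = sin(alpha)
    theta1 = np.arctan2(left[1, 0], left[0, 0])
    theta2 = np.arctan2(right_t[1, 0], right_t[0, 0])
    return alpha, theta1, theta2

def vector_from_angles(alpha, theta1, theta2):
    """Rebuild u = sum_i a_i * kron(v_i^1, v_i^2) from the three angles."""
    a = [np.cos(alpha), np.sin(alpha)]
    v1, v2t = rot(theta1), rot(theta2)           # columns of v1, rows of v2t
    return sum(a[i] * np.kron(v1[:, i], v2t[i, :]) for i in range(2))
```

Feeding the angles returned by `schmidt_angles(u)` into `vector_from_angles` should reproduce `u` for any real unit vector of length 4, which is the sense in which three rotation gates suffice for a 4 by 4 block.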



C. Comparison with the Non-Programmable Circuit Designs

The reported non-general circuit decompositions have CNOT complexities ranging from $O(n^3 2^{2n})$ down to roughly $\tfrac{1}{2}2^{2n}$ to leading order for the most efficient one. The proven lower bound for the CNOT complexity is $2^{2n-2} - 3n/4 - 1/4$ without using any auxiliary qubits. Even though the circuit designs given in this paper are general and of fixed size for any operator, their complexities are greater only by roughly a factor of 2 compared to those non-programmable circuits. In addition, if we can make $m$ less than or equal to $2^{c-2}$, then we can also go below the lower bound. This is likely to happen because the common quantum gates in the blocks (such as the two CNOTs in 4 by 4 blocks) do not affect the upper bound of the complexity. Hence, by benefiting from this property, the lower bound complexity may be reduced with the use of higher Hilbert spaces.
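The CNOT counts quoted in this section are simple arithmetic and can be tabulated with a short script. This is a sketch with hypothetical function names; `common` stands for the number of gates shared by all blocks that are applied only once, and the lower bound is the expression $2^{2n-2} - 3n/4 - 1/4$ stated above.

```python
def cnot_count_second_design(n, c, m, common):
    """CNOT count of the second design: formation + combination + common gates.

    n: number of main qubits, c: qubits per initial block,
    m: number of distinct angle-dependent gates per block.
    """
    formation = m * 2 ** (2 * n - c)
    # One network per combination step, each with one control qubit fewer.
    combination = sum(2 ** (2 * n - c - j) for j in range(1, n - c + 1))
    return formation + combination + common

def cnot_lower_bound(n):
    """Lower bound 2^(2n-2) - 3n/4 - 1/4 for circuits without ancilla qubits."""
    return (4 ** n - 3 * n - 1) / 4

def compare(n):
    """Counts for 4x4 blocks (c=2, m=3, common=2) and 2x2 blocks (c=1, m=1)."""
    return {
        "blocks_4x4": cnot_count_second_design(n, c=2, m=3, common=2),
        "blocks_2x2": cnot_count_second_design(n, c=1, m=1, common=0),
        "lower_bound": cnot_lower_bound(n),
    }
```

Calling `compare(n)` for a few values of `n` shows both block sizes growing as $2^{2n}$ to leading order, about four times the ancilla-free lower bound and, as noted above, about twice the best reported non-programmable decompositions.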
V. DISCUSSION AND CONCLUSION

A. Programmable Quantum Chips 

The circuit designs given here are independent of the type of operator; hence they can be used to design general purpose quantum processors and quantum chips in which the angles are set by a preprocessing unit. They can also help in the design of possible quantum gate arrays [7]. In addition, because the circuit designs depend directly on the matrix elements, for application-specific circuits aimed at implementing particular types of systems, any level of sparsity in the system may reduce the number of gates significantly in the general design; hence, more efficient quantum chips can be built for particular uses. For instance, if half of the elements of each row are zero in the given matrix, considering the first approach, the blocks at the end of the combination steps can be made to have the dimension $N/2 \times N/2$. Hence, the circuit requires fewer combination steps (the number of qubits in the ancilla is reduced by one), which lowers both the classical and CNOT complexities and makes any possible fabrication easier.



B. Finding Angles 

Finding the angle values on classical computers for a given unitary operator can be parallelized conveniently. For instance, distributing the rows to different cores is one way of parallelizing the method. This can be further refined by dividing the work into smaller blocks, so the computation time to generate the angles for both circuits can be very short.

The combination procedure described for both circuit design processes can be further improved to combine circuits for different unitary operations by considering them as initial blocks. One of the individual blocks used to generate a row of the given matrix can also be used as the state preparation circuit (for instance Fig. 7) for an arbitrary circuit. Furthermore, the circuits generated by the first approach closely resemble the qubus quantum computer [22]. Similar ideas can be used to implement circuit design techniques for this type of quantum computer as well.



C. Complex Cases 

It is important to note that, even though real matrices are considered in this paper, it is straightforward to implement any complex case as well by considering each rotation gate in the first circuit design as also being able to produce any complex element of a unitary matrix. This may require more than one simple rotation gate, but it does not increase the upper bound of the quantum complexity. However, the modification for the second circuit may not be as simple as for the first one: it may require additional gates during the combination and formation steps.



D. Simulation of Molecular Hamiltonians 

The exponential growth of computational cost with the number of atoms is a huge computational challenge for exact quantum chemistry calculations. Even for a simple molecule like methanol, using only the 6-31G** basis for the valence electrons, there are 50 orbitals. The 18 valence electrons can be distributed in these orbitals in any way that satisfies the Pauli exclusion principle. This leads to about $10^{17}$ possible configurations, making an exact or Full Configuration Interaction (FCI) calculation almost impossible on classical computers. However, it has been shown that a quantum computer can be used to estimate the ground and excited state energies of molecules efficiently [6]. For the simulation of a quantum system, it is necessary to find a quantum circuit equivalent to the unitary propagator of the Hamiltonian representing that system. The molecular electronic Hamiltonian, in the Born-Oppenheimer approximation, is described in the second quantization form as [6, 16, 27]:

$$H = \sum_{pq} h_{pq}\, a_p^\dagger a_q + \frac{1}{2}\sum_{pqrs} h_{pqrs}\, a_p^\dagger a_q^\dagger a_r a_s, \qquad (22)$$

where the matrix elements $h_{pq}$ and $h_{pqrs}$ are the sets of one- and two-electron integrals, and $a_j$ and $a_j^\dagger$ are the spinless fermionic annihilation and creation operators. Let the set of single-particle spatial functions constitute the molecular orbitals $\{\varphi(\mathbf{r})\}_{k=1}^{M}$ and the set of spin orbitals $\{\chi(\mathbf{x})\}_{p=1}^{2M}$ be defined with $\chi_p = \varphi_i\sigma_i$ and the set of space-spin coordinates $\mathbf{x} = (\mathbf{r}, \omega)$, where $\sigma_i$ is a spin function. The one-electron integral is defined as [6]:



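$$h_{pq} = \int d\mathbf{x}\, \chi_p^*(\mathbf{x})\left(-\frac{1}{2}\nabla^2 - \sum_\alpha \frac{Z_\alpha}{r_{\alpha x}}\right)\chi_q(\mathbf{x}) = \langle\varphi_p|H^{(1)}|\varphi_q\rangle\,\delta_{\sigma_p\sigma_q}, \qquad (23)$$

and the two-electron integral is:

$$h_{pqrs} = \int d\mathbf{x}_1 d\mathbf{x}_2\, \frac{\chi_p^*(\mathbf{x}_1)\chi_q^*(\mathbf{x}_2)\chi_s(\mathbf{x}_1)\chi_r(\mathbf{x}_2)}{r_{12}} = \langle\varphi_p|\langle\varphi_q|H^{(2)}|\varphi_r\rangle|\varphi_s\rangle\,\delta_{\sigma_p\sigma_s}\delta_{\sigma_q\sigma_r}, \qquad (24)$$

where $Z_\alpha$ is the charge of the $\alpha$th nucleus, $r_{\alpha x}$ is the distance between the $\alpha$th nucleus and the electron, $r_{12}$ is the distance between electrons, $\nabla^2$ is the Laplacian of the electron spatial coordinates, and $\chi_p(\mathbf{x})$ is a selected single-particle basis: $\chi_p = \varphi_p\sigma_p$, $\chi_q = \varphi_q\sigma_q$, $\chi_r = \varphi_r\sigma_r$, and $\chi_s = \varphi_s\sigma_s$.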

To describe the hydrogen molecule in a minimal basis, which is
the minimum number of spatial functions required to describe the
system, one spatial function is needed per atom, denoted
\varphi_{H1} and \varphi_{H2}. The molecular spatial orbitals
are defined by symmetry: \varphi_g = \varphi_{H1} + \varphi_{H2}
and \varphi_u = \varphi_{H1} - \varphi_{H2}, which correspond to
four spin orbitals:
|\chi_1\rangle = |\varphi_g\rangle|\alpha\rangle,
|\chi_2\rangle = |\varphi_g\rangle|\beta\rangle,
|\chi_3\rangle = |\varphi_u\rangle|\alpha\rangle, and



With the spatial integral values evaluated for the atomic
distance 1.401 a.u., the Hamiltonian matrix is found as a 16 by
16 matrix, so 4 qubits are required to implement the unitary
propagator of this Hamiltonian, which is found from e^{-iHt} by
setting t = 1 (see note 33).
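As a rough illustration of this step, the sketch below exponentiates a Hermitian matrix to obtain e^{-iHt} and reads off the qubit count from the matrix dimension. It is only a sketch under stated assumptions: the 16 by 16 matrix `H` is a hypothetical symmetric stand-in and not the hydrogen Hamiltonian, and the helper `propagator` uses a scaled Taylor series with repeated squaring in place of the Pade-based MATLAB routine mentioned in note 33.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[complex(i == j) for j in range(n)] for i in range(n)]

def propagator(h, t=1.0, terms=20):
    """Approximate exp(-i*h*t) by a scaled Taylor series plus repeated squaring."""
    n = len(h)
    a = [[-1j * t * h[i][j] for j in range(n)] for i in range(n)]
    norm = max(sum(abs(x) for x in row) for row in a)
    # Halve the matrix until its norm is at most 1/2 so the series converges fast.
    squarings = max(0, math.ceil(math.log2(norm))) + 1 if norm > 0 else 0
    a = [[x / 2 ** squarings for x in row] for row in a]
    result, term = identity(n), identity(n)
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def qubits_needed(dim):
    """Number of qubits whose state space has dimension dim (a power of two)."""
    return dim.bit_length() - 1

# Hypothetical stand-in: a real symmetric 16 x 16 matrix, NOT the H2 Hamiltonian.
N = 16
H = [[0.1 * math.cos(i + j) + (0.2 * i if i == j else 0.0) for j in range(N)]
     for i in range(N)]
U = propagator(H, t=1.0)
print(qubits_needed(len(U)))
```

Because the stand-in matrix is Hermitian, the resulting `U` is unitary up to rounding, which is the property the circuit design relies on.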

The accuracy of the circuit design for the unitary propagator
also determines the accuracy of the simulation. Generating
quantum circuits with matrix decomposition techniques or global
optimization methods (Ref. 34; as done for the water and
hydrogen molecules in Ref. 6) requires searching a huge complex
space and simulating the unitary matrices of quantum systems on
classical computers. For large matrices, this hinders the
efficiency, and hence the accuracy, of the circuits. In our
circuits, the angles for the rotation gates are determined
directly from the matrix elements (for instance, in the first
design, Fig. 3, we only take the arccosine of the values), and
generating these angles requires only a few computations, so the
accuracy and the efficiency of the circuits are always high.
This helps to obtain very accurate circuit designs for the
simulation of quantum systems. For instance, for the 16 by 16
unitary propagator of the hydrogen molecule given in Ref. 6,
nine qubits are required in the circuit scheme given in Fig. 3.
Since the unitary propagator is highly sparse and has only 19
nonzero elements, all but 19 of the uniformly controlled gates
in the circuit are the identity. Hence, in Appendix B we show
how to reduce the number of qubits to 6 (Fig. 8), and we give
the rotation values for the gates in Table I. Furthermore, since
our circuit designs are fixed, using different basis sets or
parameters to compute the Hamiltonian will not change the
circuit design or its accuracy.
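The qubit and gate counts quoted above can be tallied with a few lines of code. The following is a minimal sketch, assuming the 2n + 1 qubit budget of the first design (n main and n + 1 ancilla qubits for a 2^n by 2^n matrix) and assuming that only nonzero elements need a non-identity rotation; the function names and the small example matrix are hypothetical.

```python
def first_design_qubits(dim):
    """Qubit budget 2n + 1 for a dim x dim matrix with dim = 2**n."""
    n = dim.bit_length() - 1
    if 2 ** n != dim:
        raise ValueError("dimension must be a power of two")
    return {"main": n, "ancilla": n + 1, "total": 2 * n + 1}

def nontrivial_rotations(matrix, tol=1e-12):
    """Count the elements that need a non-identity rotation (the nonzero ones)."""
    return sum(1 for row in matrix for x in row if abs(x) > tol)

# Hypothetical sparse 4 x 4 example, used only to exercise the counting.
example = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.6, 0.8, 0.0],
    [0.0, -0.8, 0.6, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
print(first_design_qubits(16))
print(nontrivial_rotations(example), "of", len(example) ** 2, "gates are non-identity")
```

For a 16 by 16 matrix this gives 4 main and 5 ancilla qubits, nine in total, matching the count stated in the text.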

In summary, we present general programmable quantum circuits
which can simulate any given 2^n by 2^n real matrix. Because of
the structure of the circuits, they can be used to fabricate
specific or general purpose quantum chips and processors. Since
the circuit designs are highly dependent on the matrix elements,
for application-specific circuits aimed at implementing a
particular type of system, any level of sparsity in the system
may reduce



|\chi_4\rangle = |\varphi_u\rangle|\beta\rangle. The STO-3G basis
is used to evaluate the spatial integrals of the Hamiltonian,
which is defined as H = H^{(1)} + H^{(2)}, where, since
h_{pqrs} = h_{pqsr}, H^{(1)} and H^{(2)} are simplified as:

the number of gates significantly. In addition, we show 
that the generation of circuits with the complexity less 
than the lower bound is possible by making m < 2 C ~ 2 
and increasing $ in the given complexity. 
Appendix A: The explicit illustration of the steps

Here, we detail the implementation of the input modification,
the formation (V_f), and the combination (V_c) steps. A sketch
of the matrix format of the operations can be found in Eq. (A3),
for the one-qubit case in the first circuit design, and in
Eq. (A8) and Eq. (A9), for the two-qubit case in the second
circuit design; here blanks denote zeros and dots denote matrix
parts of no interest for the final operation.



1. First circuit design 

Starting with an arbitrary input, |\psi\rangle = (\alpha_0, \alpha_1)^T, and
the following arbitrary unitary matrix:

    U = \begin{pmatrix} u_{00} & u_{01} \\ u_{10} & u_{11} \end{pmatrix} \qquad (A1)



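    H^{(1)} = h_{11} a_1^\dagger a_1 + h_{22} a_2^\dagger a_2 + h_{33} a_3^\dagger a_3 + h_{44} a_4^\dagger a_4, \qquad (25)

    H^{(2)} = h_{1221} a_1^\dagger a_2^\dagger a_2 a_1 + h_{3443} a_3^\dagger a_4^\dagger a_4 a_3
            + h_{1441} a_1^\dagger a_4^\dagger a_4 a_1 + h_{2332} a_2^\dagger a_3^\dagger a_3 a_2
            + (h_{1331} - h_{1313}) a_1^\dagger a_3^\dagger a_3 a_1
            + (h_{2442} - h_{2424}) a_2^\dagger a_4^\dagger a_4 a_2
            + (h_{1423}) (a_1^\dagger a_4^\dagger a_2 a_3 + a_3^\dagger a_2^\dagger a_4 a_1)
            + (h_{1243}) (a_1^\dagger a_2^\dagger a_4 a_3 + a_3^\dagger a_4^\dagger a_2 a_1). \qquad (26)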
the first method requires 2n + 1 = 3 qubits for the simulation
with the input:

    |\psi_{initial}\rangle = |0\rangle \otimes |0\rangle \otimes |\psi\rangle = (\alpha_0, \alpha_1, 0, 0, 0, 0, 0, 0)^T \qquad (A2)



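The following represent the formation matrix, V_f, the matrix
after the combination step, V, and the modified input,
|\tilde{\psi}\rangle:

    [Eq. (A3): sketch of V_f, V, and |\tilde{\psi}\rangle. The
    surviving labels show u_{00}, u_{01}, u_{10}, and u_{11} as the
    leading entries of the diagonal blocks of V_f, and the rows
    (u_{00} \cdot u_{01} \cdot) and (u_{10} \cdot u_{11} \cdot) in
    V; the remaining layout is not recoverable from this copy.]   (A3)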



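For illustration purposes, below we also present the full forms
of some of the operators and the output vector for the same
case.

The full form of the matrix resulting from the formation step is
as follows:

    V_f = \begin{pmatrix}
    u_{00} & \sqrt{1-u_{00}^2} & & & & & & \\
    -\sqrt{1-u_{00}^2} & u_{00} & & & & & & \\
    & & u_{01} & \sqrt{1-u_{01}^2} & & & & \\
    & & -\sqrt{1-u_{01}^2} & u_{01} & & & & \\
    & & & & u_{10} & \sqrt{1-u_{10}^2} & & \\
    & & & & -\sqrt{1-u_{10}^2} & u_{10} & & \\
    & & & & & & u_{11} & \sqrt{1-u_{11}^2} \\
    & & & & & & -\sqrt{1-u_{11}^2} & u_{11}
    \end{pmatrix} \qquad (A4)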



The combination matrix V_c and the matrix for input modification
V_m are defined as:

    [Eq. (A5): two 8 by 8 matrices, V_c and V_m, whose nonzero
    entries appear as 1 and 1/\sqrt{2}; the positions of the
    entries are not recoverable from this copy.]   (A5)
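The three steps for the one-qubit case can be checked numerically. The sketch below applies an input modification, the block-diagonal formation matrix V_f of Eq. (A4), and a combination step to the input of Eq. (A2). The specific V_m and V_c used here are assumptions of this sketch (a Hadamard on the first qubit followed by a swap of the last two qubits, and a Hadamard on the middle qubit); they may differ from the forms in Eq. (A5). The example matrix `U` and the input `psi` are hypothetical.

```python
import math

def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

S = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
HAD = [[S, S], [S, -S]]
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

def rotation_block(u):
    """2 x 2 block with leading element u, as in Eq. (A4)."""
    s = math.sqrt(1 - u * u)
    return [[u, s], [-s, u]]

def formation(u):
    """Block-diagonal V_f built from the four elements of a real 2 x 2 matrix u."""
    vf = [[0.0] * 8 for _ in range(8)]
    for b, elem in enumerate([u[0][0], u[0][1], u[1][0], u[1][1]]):
        blk = rotation_block(elem)
        for r in range(2):
            for c in range(2):
                vf[2 * b + r][2 * b + c] = blk[r][c]
    return vf

def simulate(u, psi):
    state = [psi[0], psi[1]] + [0.0] * 6              # |0>|0>|psi>
    state = matvec(kron(HAD, kron(I2, I2)), state)    # assumed V_m, part 1
    state = matvec(kron(I2, SWAP), state)             # assumed V_m, part 2
    state = matvec(formation(u), state)               # V_f
    state = matvec(kron(I2, kron(HAD, I2)), state)    # assumed V_c
    return state

phi = 0.3  # hypothetical real unitary (a rotation) and normalized input
U = [[math.cos(phi), -math.sin(phi)], [math.sin(phi), math.cos(phi)]]
psi = [0.6, 0.8]
final = simulate(U, psi)
print(2 * final[0], 2 * final[4])  # amplitudes on |000> and |100>, rescaled
```

With these choices the amplitudes on |000> and |100> equal the two components of U|psi> divided by 2, which is the sense in which those states carry the simulation result.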
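For the initial input |\psi_{initial}\rangle as in Eq. (A2), the
final output state |\psi_{final}\rangle becomes:

    |\psi_{final}\rangle = V_c V_f V_m |\psi_{initial}\rangle = \frac{1}{2} \begin{pmatrix}
    \alpha_0 u_{00} + \alpha_1 u_{01} \\
    -\alpha_0 \sqrt{1-u_{00}^2} - \alpha_1 \sqrt{1-u_{01}^2} \\
    \alpha_0 u_{00} - \alpha_1 u_{01} \\
    -\alpha_0 \sqrt{1-u_{00}^2} + \alpha_1 \sqrt{1-u_{01}^2} \\
    \alpha_0 u_{10} + \alpha_1 u_{11} \\
    -\alpha_0 \sqrt{1-u_{10}^2} - \alpha_1 \sqrt{1-u_{11}^2} \\
    \alpha_0 u_{10} - \alpha_1 u_{11} \\
    -\alpha_0 \sqrt{1-u_{10}^2} + \alpha_1 \sqrt{1-u_{11}^2}
    \end{pmatrix} \qquad (A6)

Clearly the normalized states |000\rangle and |100\rangle
simulate the original given system.

2. Second circuit design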

For the same case, since the second circuit design initially
works on at least a pair of matrix elements, it will create the
unitary at the initial step. There will be no need for the
combination step. Hence, the output will be simulated on the
states |00\rangle and |10\rangle. For the two-qubit system
below, the simulation goes as follows:

    U = \begin{pmatrix}
    u_{00} & u_{01} & u_{02} & u_{03} \\
    u_{10} & u_{11} & u_{12} & u_{13} \\
    u_{20} & u_{21} & u_{22} & u_{23} \\
    u_{30} & u_{31} & u_{32} & u_{33}
    \end{pmatrix} \qquad (A7)



In the formation step, if we use 4 by 4 blocks as shown in
Fig. 4, there will be no need for the combination step, since we
will have already formed the rows of U at the formation step.
However, if we use 2 by 2 initial blocks, we need one rotation
gate for each pair of elements, followed by the combination
step. Thus, at the end of the formation step, we get the
following matrix:



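    \begin{pmatrix}
    k_0 u_{00} \;\; k_0 u_{01} & & & & & & & \\
    & k_1 u_{02} \;\; k_1 u_{03} & & & & & & \\
    & & k_2 u_{10} \;\; k_2 u_{11} & & & & & \\
    & & & k_3 u_{12} \;\; k_3 u_{13} & & & & \\
    & & & & k_4 u_{20} \;\; k_4 u_{21} & & & \\
    & & & & & k_5 u_{22} \;\; k_5 u_{23} & & \\
    & & & & & & k_6 u_{30} \;\; k_6 u_{31} & \\
    & & & & & & & k_7 u_{32} \;\; k_7 u_{33}
    \end{pmatrix} \qquad (A8)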



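where the k_i are the normalization constants. After the
sequential combination steps and the modification on the input,
we get the following matrix and the modified input (only the
leading row of each 4 by 4 diagonal block of V is shown):

    V = \begin{pmatrix}
    u_{00} \; u_{01} \; u_{02} \; u_{03} & & & \\
    & u_{10} \; u_{11} \; u_{12} \; u_{13} & & \\
    & & u_{20} \; u_{21} \; u_{22} \; u_{23} & \\
    & & & u_{30} \; u_{31} \; u_{32} \; u_{33}
    \end{pmatrix},

    |\tilde{\psi}\rangle = \frac{1}{2} (\alpha_0, \alpha_1, \alpha_2, \alpha_3, \alpha_0, \alpha_1, \alpha_2, \alpha_3, \alpha_0, \alpha_1, \alpha_2, \alpha_3, \alpha_0, \alpha_1, \alpha_2, \alpha_3)^T \qquad (A9)

The final state is equivalent to |\psi_{final}\rangle =
V|\tilde{\psi}\rangle. In |\psi_{final}\rangle, the states
|0000\rangle, |0100\rangle, |1000\rangle, and |1100\rangle are
the respective states that simulate the original given unitary
matrix.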
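The statement about which states carry the result can be verified with a short calculation. The sketch below is an illustration under assumptions: it builds a 16 by 16 matrix holding row b of U as the leading row of the b-th 4 by 4 diagonal block, leaves the rows of no interest as zeros (so this `V` is not itself unitary), and applies it to the modified input with the 1/2 prefactor. The example matrix `U` and input `alpha` are hypothetical.

```python
def block_rows_matrix(u):
    """Place row b of the 4 x 4 matrix u as the leading row of diagonal block b.
    Rows of no interest (the dots in the sketch) are left as zeros here."""
    v = [[0.0] * 16 for _ in range(16)]
    for b in range(4):
        for c in range(4):
            v[4 * b][4 * b + c] = u[b][c]
    return v

def modified_input(alpha):
    """Four copies of the input amplitudes with the overall factor 1/2."""
    return [0.5 * a for _ in range(4) for a in alpha]

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

# Hypothetical real unitary (a scaled 4 x 4 Hadamard matrix) and normalized input.
U = [[0.5, 0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5, -0.5],
     [0.5, 0.5, -0.5, -0.5],
     [0.5, -0.5, -0.5, 0.5]]
alpha = [0.1, 0.7, 0.5, 0.5]

out = matvec(block_rows_matrix(U), modified_input(alpha))
marked = [out[i] for i in (0b0000, 0b0100, 0b1000, 0b1100)]
print([2 * x for x in marked])  # compare with U applied to alpha
print(matvec(U, alpha))
```

The amplitudes read from indices 0, 4, 8, and 12 agree with U applied to the input up to the factor 1/2.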



Appendix B: Explicit Circuit for the unitary propagator of 
the Hydrogen Molecule 



gates, where the elements to be switched are determined by the
control qubits. The input should also be permuted prior to the
circuit, which can be done by simply switching the input for the
qubits. At the end of this circuit, since the leading rows of
the 4 by 4 matrices simulate the unitary, we get the simulation
result from the states |0\rangle, |4\rangle, |8\rangle,
|12\rangle, \ldots, |60\rangle.



As mentioned, the unitary matrix, U_{H_2}, for the hydrogen
molecule has 19 nonzero elements, 15 of them located on the
diagonal. Since the unitary is 16 by 16, we need 4 main and 5
ancilla qubits for the first circuit design given in Fig. 3. The
uniformly controlled rotation gates in the formation steps are
R_y gates followed by R_z gates, where we use the identity for
the zero elements. However, we can benefit from the sparsity of
the matrix and reduce the number of ancilla qubits from 5 to 2.
The non-diagonal matrix elements are located at (13, 4),
(4, 13), (7, 10), and (10, 7), where (i, j) are the row and
column indices. We apply a permutation matrix, P, to reduce the
bandwidth of the matrix. P U_{H_2} takes the non-diagonal
elements (13, 4), (4, 13), (7, 10), and (10, 7) to (5, 4),
(4, 5), (7, 8), and (8, 7), which creates another unitary,
denoted here \tilde{U}_{H_2}. \tilde{U}_{H_2} is a structured
matrix in which all the elements are located at the (i, i),
(i, i + 1), or (i - 1, i) positions. Hence, we can use 2 ancilla
qubits and 4 main qubits to create a matrix V having 4 x 4 block
matrices on its diagonal by using only one Hadamard gate in the
combination step. In the formation step, the control qubits for
the R_y and R_z gates are determined so as to form the pairs of
(i, i) and (i, i + 1), or (i - 1, i) and (i, i), elements on the
first row of these 4 by 4 matrices. The angle values are
determined from the polar representation of each element and are
given in Table I. The circuit for \tilde{U}_{H_2} is shown in
Fig. 8, where R represents a combination of an R_y and an R_z
gate. Please note that the circuit equivalents of permutation
matrices such as P are combinations of multi-control CNOT


FIG. 8: The circuit for the simulation of the hydrogen
molecule. The angle values for the rotation gates are
determined to create the elements of U_{H_2}. There are only
19 rotation gates; the rest are X gates, which put the elements
in the right order after the combination. For diagonal elements
of U_{H_2}, these rotations are only around the z-axis. For
nonzero non-diagonal elements, a rotation about the z-axis is
followed by a rotation about the y-axis. The angles for these
gates are given in Table I.
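The index mapping quoted for the permutation can be checked directly. The sketch below assumes one permutation consistent with that mapping: exchanging indices 5 and 13 and indices 8 and 10 (1-based), applied to both rows and columns. The text only names P, so this particular choice and the helper names are assumptions.

```python
def swap_permutation(n, swaps):
    """1-based index map on {1, ..., n} exchanging each pair listed in swaps."""
    perm = {i: i for i in range(1, n + 1)}
    for a, b in swaps:
        perm[a], perm[b] = b, a
    return perm

def relocate(positions, perm):
    """Apply the same index map to the row and column of every position."""
    return {(perm[r], perm[c]) for r, c in positions}

off_diagonal = {(13, 4), (4, 13), (7, 10), (10, 7)}
perm = swap_permutation(16, [(5, 13), (8, 10)])   # assumed choice of P
moved = relocate(off_diagonal, perm)
bandwidth = max(abs(r - c) for r, c in moved)
print(sorted(moved), bandwidth)
```

Diagonal positions stay on the diagonal under such a relabeling, so after it every nonzero element lies within one position of the diagonal, which is what allows the 4 by 4 block structure described above.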
TABLE I: Parameters for the Rotation Gates

State of Control Qubits   Matrix Element     Angle for R_z   Angle for R_y
00000                     0.9788-0.2049i     -0.4127
00010                     0.3987+0.9171i      2.3214
00100                     0.3987+0.9171i      2.3214
00110                    -0.2607+0.9517i      3.6763         0.3253
00111                     0.1401-0.0817i     -1.0559         2.8158
01000                     0.1401-0.0817i     -1.0559         2.8158
01001                    -0.2607+0.9517i      3.6763         0.3253
01011                     0.9354+0.3535i      0.7226
01101                     0.3189+0.9478i      2.4925
01110                     0.4766+0.8604i      2.1299         0.3629
01111                    -0.1577+0.0874i      5.271          2.779
10000                    -0.1577+0.0874i      5.271          2.779
10001                     0.4766+0.8604i      2.1299         0.3629
10011                     0.3130+0.9498i      2.5049
10101                     0.3189+0.9478i      2.4925
10111                     0.3130+0.9498i      2.5049
11001                     0.9569+0.2410i      0.4934
11011                     0.8889+0.4582i      0.9519
11101                     0.8889+0.4582i      0.9519
11111                     1
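The tabulated angles can be recomputed from the matrix elements. The sketch below assumes the convention that appears to reproduce the listed values: the R_z angle is twice the polar angle of the element and the R_y angle is twice the arccosine of its modulus. This convention is inferred from the numbers in Table I, not stated explicitly in the text, and the function name is hypothetical.

```python
import cmath
import math

def rotation_angles(u):
    """Return (theta_z, theta_y) for a complex element u with |u| <= 1."""
    modulus = min(1.0, abs(u))          # guard: rounded entries can exceed 1 slightly
    theta_z = 2.0 * cmath.phase(u)      # twice the polar angle of u
    theta_y = 2.0 * math.acos(modulus)  # cos(theta_y / 2) = |u|
    return theta_z, theta_y

for element in (0.9788 - 0.2049j, 0.1401 - 0.0817j, -0.1577 + 0.0874j):
    tz, ty = rotation_angles(element)
    print(f"{element}: R_z {tz:.4f}, R_y {ty:.4f}")
```

Elements of unit modulus give an R_y angle of zero under this convention, which is consistent with the blank R_y entries in the table. Small differences in the last digit can arise because the tabulated elements are rounded to four decimals.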









BIBLIOGRAPHY

1 C. Bobda, Introduction to Reconfigurable Computing: Architectures, Algorithms, and Applications (Springer Publishing Company, Incorporated, 2007)
2 E. Farhi, J. Goldstone, S. Gutmann, J. Lapan, A. Lundgren, and D. Preda, Science 292, 472 (2001)
3 M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, 2000)
4 P. Kaye, R. Laflamme, and M. Mosca, An Introduction to Quantum Computing (Oxford, 2007)
5 C. P. Williams, Explorations in Quantum Computing, 2nd ed., Texts in Computer Science (Springer, London, 2011)
6 A. Daskin and S. Kais, J. Chem. Phys. 134, 144112 (2011)
7 M. A. Nielsen and I. L. Chuang, Phys. Rev. Lett. 79, 321 (Jul 1997)
8 J. P. Paz and A. Roncaglia, Phys. Rev. A 68, 052316 (Nov 2003)
9 P. B. M. Sousa and R. V. Ramos, Quantum Info. Comput. 7, 228 (Mar. 2007), ISSN 1533-7146
10 G. Vidal and C. M. Dawson, Phys. Rev. A 69, 10301 (Jan 2004)
11 J. Zhang, J. Vala, S. Sastry, and K. B. Whaley, Phys. Rev. Lett. 91, 027903 (Jul 2003)
12 M. Möttönen, J. J. Vartiainen, V. Bergholm, and M. M. Salomaa, Phys. Rev. Lett. 93, 130502 (Sep 2004)
13 F. Vatan and C. P. Williams, "Realization of a General Three-Qubit Quantum Gate," (Jan. 2004), arXiv:quant-ph/0401178
14 W. Hai-Rui, D. Yao-Min, and Zhang-Jie, Chin. Phys. Lett. 25, 3107 (2008)
15 D. Hanneke, J. P. Home, J. D. Jost, J. M. Amini, D. Leibfried, and D. J. Wineland, Nature Physics 6, 13 (Nov. 2009), ISSN 1745-2473
16 B. P. Lanyon, M. Barbieri, M. P. Almeida, T. Jennewein, T. C. Ralph, K. J. Resch, G. J. Pryde, J. L. O'Brien, A. Gilchrist, and A. G. White, Nature Physics 5, 134 (Dec. 2009), ISSN 1745-2473
17 T. D. Mackay, S. D. Bartlett, L. T. Stephenson, and B. C. Sanders, J. Phys. A: Math. Theor. 35, 2745 (2002)
18 G. H. Golub and C. F. Van Loan, Matrix Computations (3rd ed.) (Johns Hopkins University Press, Baltimore, MD, USA, 1996), ISBN 0801854148
19 B. Drury and P. Love, Journal of Physics A: Mathematical and Theoretical 41, 395305 (2008)
20 V. V. Shende, S. S. Bullock, and I. L. Markov, IEEE Trans. on CAD 25, 1000 (2006)
21 B. J. Fino and V. R. Algazi, IEEE Trans. Comput. 25, 1142 (Nov. 1976), ISSN 0018-9340
22 K. L. Brown, S. De, V. M. Kendon, and W. J. Munro, New Journal of Physics 13, 095007 (2011)
23 D. S. Abrams and S. Lloyd, Phys. Rev. Lett. 83, 5162 (Dec 1999)
24 A. Aspuru-Guzik, A. D. Dutoi, P. J. Love, and M. Head-Gordon, Science 309, 1704 (2005)
25 L. Veis and J. Pittner, J. Chem. Phys. 133, 194106 (2010)
26 H. Wang, S. Kais, A. Aspuru-Guzik, and M. R. Hoffmann, Phys. Chem. Chem. Phys. 10, 5388 (September 2008)
27 J. D. Whitfield, J. Biamonte, and A. Aspuru-Guzik, Molecular Physics 109, 735 (2011)
28 A. Papageorgiou and C. Zhang, Quantum Information Processing 11, 541 (Apr. 2012), ISSN 1570-0755
29 B. P. Lanyon, J. D. Whitfield, G. G. Gillett, M. E. Goggin, M. P. Almeida, I. Kassal, J. D. Biamonte, M. Mohseni, B. J. Powell, M. Barbieri, A. Aspuru-Guzik, and A. G. White, Nature Chemistry 2, 106 (Jan. 2010), ISSN 1755-4330
30 I. Kassal, S. P. Jordan, P. J. Love, M. Mohseni, and A. Aspuru-Guzik, Proceedings of the National Academy of Sciences 105, 18681 (2008)
31 L. Veis, J. Višňák, T. Fleig, S. Knecht, T. Saue, L. Visscher, and J. Pittner, Phys. Rev. A 85, 030304 (Mar 2012)
32 I. Kassal, J. D. Whitfield, A. Perdomo-Ortiz, M.-H. Yung, and A. Aspuru-Guzik, Annual Review of Physical Chemistry 62, 185 (2011)
33 For the matrix exponentiation, the MATLAB expm function, which uses the Pade approximation with scaling and squaring, is used.
34 A. Daskin and S. Kais, Mol. Phys. 109, 761 (2011)



